Sinhala Grapheme-to-Phoneme Conversion and Rules for Schwa Epenthesis

نویسندگان

  • Asanka Wasala
  • Ruvan Weerasinghe
  • Kumudu Gamage
چکیده

This paper describes an architecture to convert Sinhala Unicode text into phonemic specification of pronunciation. The study was mainly focused on disambiguating schwa-/\/ and /a/ vowel epenthesis for consonants, which is one of the significant problems found in Sinhala. This problem has been addressed by formulating a set of rules. The proposed set of rules was tested using 30,000 distinct words obtained from a corpus and com-pared with the same words manually transcribed to phonemes by an expert. The Grapheme-to-Phoneme (G2P) con-version model achieves 98 % accuracy.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Grapheme-to-Phoneme Conversion for Amharic Text-to-Speech System

Developing correct Grapheme-to-Phoneme (GTP) conversion method is a central problem in text-tospeech synthesis. Particularly, deriving phonological features which are not shown in orthography is challenging. In the Amharic language, geminates and epenthetic vowels are very crucial for proper pronunciation but neither is shown in orthography. This paper describes an architecture, a preprocessing...

متن کامل

Dialect variation in Boro Language and Grapheme-to-Phoneme conversion rules to handle lexical lookup fails in Boro TTS System

It is not possible to include all the words in a natural language for general text-to-speech system. Grapheme-tophoneme conversion system is essential to pronounce a word which is out of vocabulary. Grapheme-to-phoneme rules play a vital role where lexical lookup fails. Though basic Grapheme-tophoneme rules system is very simple yet it is very powerful for naturalness of a TTS system. Letter-to...

متن کامل

Decision Tree Learning for Automatic Grapheme to Phoneme Conversion for Tamil N.Udhyakumar, C.S.Kumar, R.Srinivasan and R.Swaminathan

This paper describes a novel approach for grapheme to phoneme conversion using decision tree learning technique. The proposed approach, unlike the rule based approach, can generate rules spanning wider context and thus give better accuracy for the conversion.

متن کامل

Phonological variation: epenthesis and deletion of schwa in Dutch

Two types of phonological variation in Dutch, resulting from optional rules, are schwa epenthesis and schwa deletion. In a lexical decision experiment it was investigated whether the phonological variants were processed similarly to the standard forms. It was found that the two types of variation patterned differently. Words with schwa epenthesis were processed faster and more accurately than t...

متن کامل

A Diachronic Approach for Schwa Deletion in Indo Aryan Languages

Schwa deletion is an important issue in grapheme-to-phoneme conversion for IndoAryan languages (IAL). In this paper, we describe a syllable minimization based algorithm for dealing with this that outperforms the existing methods in terms of efficiency and accuracy. The algorithm is motivated by the fact that deletion of schwa is a diachronic and sociolinguistic phenomenon that facilitates faste...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006